Entropy-driven dynamics and robust learning procedures in games
Authors
Abstract
In this paper, we introduce a new class of game dynamics consisting of a payoff-driven, replicator-like term modulated by an entropy barrier that keeps players away from the boundary of the strategy space. We show that these entropy-driven dynamics are equivalent to players computing a score as their ongoing exponentially discounted cumulative payoff and then applying a quantal choice model to the scores to pick an action. This dual perspective allows us to extend the folk theorem on convergence to quantal response equilibria (QRE) to this setting for potential games. It also provides the main ingredients for designing an effective discrete-time learning algorithm that is fully distributed and requires only partial information to converge to QRE. This convergence is resilient to stochastic perturbations and observation errors and does not require any synchronization between the players.

Key-words: Reinforcement Learning, Stochastic Approximation, Quantal Response Equilibria, Entropy

∗ PRiSM, University of Versailles, 45 avenue des Etats-Unis, 78035 Versailles, France
† Inria and University of Grenoble (LIG), 38330 Grenoble, France
‡ French National Center for Scientific Research (CNRS) and University of Grenoble (LIG), 38330 Grenoble, France

hal-00790815, version 1 - 21 Feb 2013

French title: Dynamiques de jeux avec un terme d'entropie et leurs propriétés de résilience (Game dynamics with an entropy term and their resilience properties)

Résumé (translated from French): In this paper, we introduce a new class of game dynamics with a replicator-type payoff term modulated by an entropic barrier that keeps the players' strategies away from the boundaries of the domain. We show that these dynamics with an entropy term can also be obtained by players who maintain a score in the form of their discounted cumulative payoff and who select their actions as a quantal response to their current score. This dual view of the dynamics makes it possible to establish the fundamental theorem of convergence to the fixed points of the quantal response (which are close to Nash equilibria) in the case of potential games. It also makes it possible to design an effective discrete algorithm, fully decentralized and using only the local data available to each player, that converges to the fixed points of the dynamics. This convergence is preserved in the presence of random perturbations and measurement errors and does not require synchronization between the players.

Mots-clés (translated): reinforcement learning, stochastic approximation, quantal response equilibria, entropy
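The score-then-choose mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a logit (softmax) rule as the quantal choice model, a plain Euler step for the discounted score, and access to each action's expected payoff, whereas the paper's algorithm is said to need only partial information. The names `logit_choice`, `run_learning`, `discount_rate`, and `step_size`, and the 2x2 common-interest game (a simple potential game), are all hypothetical choices made for illustration.

```python
import numpy as np


def logit_choice(scores):
    """Logit (softmax) quantal choice: map scores to a mixed strategy."""
    shifted = scores - scores.max()  # shift for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()


def run_learning(payoff_a, payoff_b, discount_rate=0.5, step_size=0.05, steps=2000):
    """Two players update exponentially discounted scores and play logit responses.

    Assumed score update (Euler step of dy/dt = v - discount_rate * y):
        y <- y + step_size * (v - discount_rate * y)
    where v is the vector of expected payoffs of each action.
    """
    scores_a = np.zeros(payoff_a.shape[0])
    scores_b = np.zeros(payoff_a.shape[1])
    for _ in range(steps):
        x_a = logit_choice(scores_a)
        x_b = logit_choice(scores_b)
        v_a = payoff_a @ x_b    # expected payoff of each row action
        v_b = payoff_b.T @ x_a  # expected payoff of each column action
        scores_a = scores_a + step_size * (v_a - discount_rate * scores_a)
        scores_b = scores_b + step_size * (v_b - discount_rate * scores_b)
    return logit_choice(scores_a), logit_choice(scores_b), scores_a, scores_b


# Hypothetical common-interest game: both players receive the same payoff.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
x_a, x_b, y_a, y_b = run_learning(A, A)
print("row strategy:", x_a)
print("column strategy:", x_b)
```

In this sketch the discounting keeps the scores bounded, so the logit map never assigns zero probability to an action; this is one way to picture how the strategies stay away from the boundary of the strategy space. At a rest point the scores satisfy y = v / discount_rate, so the strategy is a logit response to the current payoffs, which is the fixed-point condition usually associated with a quantal response equilibrium.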
Similar resources
Robust Fractional-order Control of Flexible-Joint Electrically Driven Robots
This paper presents a novel robust fractional PIλ controller design for flexible joint electrically driven robots. Because of using voltage control strategy, the proposed approach is free of problems arising from torque control strategy in the design and implementation. In fact, the motor's current includes the effects of nonlinearities and coupling in the robot manipulator. Therefore, cancella...
Stability Analysis and Robust PID Control of Cable Driven Robots Considering Elasticity in Cables
In this paper robust PID control of fully-constrained cable driven parallel manipulators with elastic cables is studied in detail. In dynamic analysis, it is assumed that the dominant dynamics of cable can be approximated by linear axial spring. To develop the idea of control for cable robots with elastic cables, a robust PID control for cable driven robots with ideal rigid cables is firstly de...
Robust Control of Electrically Driven Robots in the Task Space
In this paper, a task-space controller for electrically driven robot manipulators is developed using a robust control algorithm. The controller is designed using voltage control strategy. Based on the nominal model of the robotic arm, the desired signals for motor currents are calculated and then the voltage control law is proposed based on the current errors and motor nominal electrical model....
Robust Fractional Order Control of Under-actuated Electromechanical System
This paper presents a robust fractional order controller for flexible-joint electrically driven robots under imperfect transformation of control space. The proposed approach is free from manipulator dynamics, thus free from problems associated with torque control strategy in the design and implementation. As a result, the proposed controller is simple, fast response and superior to the torque c...
Journal title:
- CoRR
Volume abs/1303.2270, Issue
Pages -
Publication date 2013